Low-parameter phylogenetic inference under the general Markov model.

Authors

  • Barbara R Holland
  • Peter D Jarvis
  • Jeremy G Sumner
Abstract

In their 2008 and 2009 articles, Sumner and colleagues introduced the "squangles", a small set of Markov invariants for phylogenetic quartets. The squangles are consistent with the general Markov (GM) model and can be used to infer quartets without the need to explicitly estimate all parameters. As the GM model is inhomogeneous and hence nonstationary, the squangles are expected to perform well compared with standard approaches when there are changes in base composition among species. However, the GM model assumes constant rates across sites, so the squangles should be confounded by data generated with invariant sites or other forms of rate variation across sites. Here we implement the squangles in a least-squares setting that returns quartets weighted by either confidence or internal edge lengths, and we show how these weighted quartets can be used as input into a variety of supertree and supernetwork methods. For the first time, we quantitatively investigate the robustness of the squangles to breaking of the constant rates-across-sites assumption on both simulated and real data sets, and we suggest a modification that improves the performance of the squangles in the presence of invariant sites. Our conclusion is that the squangles provide a novel tool for phylogenetic estimation that is complementary to methods that explicitly account for rate variation across sites but rely on homogeneous (and hence stationary) models.

Related articles

Model selection and parameter inference in phylogenetics using Nested Sampling

Bayesian inference methods rely on numerical algorithms for both model selection and parameter inference. In general, these algorithms require a high computational effort to yield reliable inferences. One of the major challenges in phylogenetics regards the estimation of the marginal likelihood. This quantity is commonly used for comparing different evolutionary models, but its calculation, eve...


Empar: EM-based algorithm for parameter estimation of Markov models on trees

The goal of branch length estimation in phylogenetic inference is to estimate the divergence time between a set of sequences based on compositional differences between them. A number of software packages are currently available that facilitate branch length estimation for homogeneous and stationary evolutionary models. Homogeneity of the evolutionary process imposes fixed rates of evolution throughout the ...


A Semialgebraic Description of the General Markov Model on Phylogenetic Trees

Many of the stochastic models used in inference of phylogenetic trees from biological sequence data have polynomial parameterization maps. The image of such a map—the collection of joint distributions for all parameter choices—forms the model space. Since the parameterization is polynomial, the Zariski closure of the model space is an algebraic variety which is typically much larger than the mo...


AWTY (are we there yet?): a system for graphical exploration of MCMC convergence in Bayesian phylogenetics

A key element to a successful Markov chain Monte Carlo (MCMC) inference is the programming and run performance of the Markov chain. However, the explicit use of quality assessments of the MCMC simulations (convergence diagnostics) in phylogenetics is still uncommon. Here, we present a simple tool that uses the output from MCMC simulations and visualizes a number of properties of primar...


Limitations of Markov chain Monte Carlo algorithms for Bayesian Inference of phylogeny

Markov Chain Monte Carlo algorithms play a key role in the Bayesian approach to phylogenetic inference. In this paper, we present the first theoretical work analyzing the rate of convergence of several Markov Chains widely used in phylogenetic inference. We analyze simple, realistic examples where these Markov chains fail to converge quickly. In particular, the studied data is generated from a ...


Journal:
  • Systematic Biology

Volume 62, Issue 1

Pages: -

Publication date: 2013